Art Unit: 2655

DETAILED ACTION 

Response to Amendment 

1. In response to the Office Action of November 6, 2003, Applicants have submitted
an Amendment, filed February 24, 2004, amending claims 1 and 5, adding claims
11-18, and arguing to overcome the art rejections. Claims 1-18 are pending in this
application. Of the pending claims, claims 1 and 11 are independent claims.

Response to Arguments 

2. Applicant's arguments with respect to claims 1-10 have been considered but are 
moot in view of the new ground(s) of rejection. 

3. This is a non-final Office Action in view of new ground(s) of rejection not 
necessitated by Applicant's amendment. 

4. Applicants contend that, under 35 U.S.C. 103 and MPEP 706.02(l)(1), U.S. Patent
6,141,644 is disqualified as a prior art reference. The examiner respectfully points out
that the reference qualifies as prior art under the 103/102(a) criteria
rather than 103/102(e).

Claim Rejections - 35 USC § 103

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made. 



6. Claims 1-8, 11-13, and 16-18 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Acero et al. ("Speaker and gender normalization for continuous density Hidden Markov Models," May '96, 
hereinafter "Acero") in view of Kuhn ("Eigenvoices for speaker adaptation," Nov. '98, hereinafter "Kuhn"). 


Claim(s) 
1 


Acero shows: 

A method for developing context dependent acoustic models, comprising the steps of: 

representing the training speech data from each of said plurality of training speakers 
(p. 342, col. 1, §1, 1st ¶: "The error rate...") as the combination of a speaker dependent
component (e.g., speaker dependent Gaussian mean vector μ: p. 343, col. 1, 1st ¶, eq. 2) and
a speaker independent component (e.g., speaker independent delta δ: p. 343, col. 1, 1st ¶,
eq. 2);

representing said speaker dependent component as centroids (e.g., Gaussian mean
vector μ, p. 343, col. 1, 1st ¶, eq. 2);

representing said speaker independent component as linear transformations (delta δ:
p. 343, col. 1, 1st ¶, eq. 2) of said centroids; and

{The transformation is linear, i.e., a linear combination (p. 342, col. 2, 1st ¶: "The use of
correlation..."; p. 343, eq. 2).}

performing maximum likelihood re-estimation (e.g., iterative estimation: p. 343, §2.2)
on said training speech data of at least one of said low-dimensional space, said centroids
(mean vector μ), and said linear transformations (delta δ: p. 343, col. 1, 1st ¶, eq. 2) to
represent context dependent acoustic model.
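
For illustration only, the centroid-plus-offset decomposition and its iterative re-estimation, as characterized above, can be sketched as follows. This is a minimal sketch and not Acero's procedure: it assumes scalar features, unit-variance Gaussians, and a hard assignment of each frame to one speaker cluster and one context, under which each maximum likelihood update reduces to an average. The names reestimate and squared_error and the sample frames are hypothetical.

```python
from collections import defaultdict


def reestimate(frames, n_iter=20):
    """Alternately update centroids mu[speaker] and offsets delta[context].

    frames: list of (speaker, context, value) tuples with scalar values.
    Assumption: the model mean for a frame is mu[speaker] + delta[context]
    with unit variance, so each update below is a simple average.
    """
    mu = defaultdict(float)
    delta = defaultdict(float)
    for _ in range(n_iter):
        # Update each centroid with the offsets held fixed.
        total, count = defaultdict(float), defaultdict(int)
        for speaker, context, x in frames:
            total[speaker] += x - delta[context]
            count[speaker] += 1
        for speaker in total:
            mu[speaker] = total[speaker] / count[speaker]
        # Update each offset with the centroids held fixed.
        total, count = defaultdict(float), defaultdict(int)
        for speaker, context, x in frames:
            total[context] += x - mu[speaker]
            count[context] += 1
        for context in total:
            delta[context] = total[context] / count[context]
    return dict(mu), dict(delta)


def squared_error(frames, mu, delta):
    """Sum of squared residuals; lower means higher unit-variance likelihood."""
    return sum((x - mu[s] - delta[c]) ** 2 for s, c, x in frames)


# Hypothetical data: two speaker clusters (A, B) and two contexts (a, b).
frames = [("A", "a", 1.5), ("A", "b", 0.5), ("B", "a", 3.5), ("B", "b", 2.5)]
mu, delta = reestimate(frames)
print(squared_error(frames, mu, delta))
```

The split between centroid and offset is not unique in this sketch (a constant can be moved from one to the other), so only differences between centroids or between offsets are meaningful.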

Acero does not show: 

developing a low-dimensional space from training speech data obtained from a 
plurality of training speakers. 

Kuhn teaches: 

developing a low-dimensional space (K-space) from training speech data obtained
from a plurality of training speakers (see Abstract).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to modify the acoustic model of Acero to include the low-dimensional
space teaching of Kuhn in order to provide a fast speaker adaptation technique. Working in a
low-dimensional space reduces the computation needed to produce a result
representative of the original, higher-dimensional space and thus leads to a faster speaker
adaptation technique.
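
For illustration only, the computational saving relied upon above can be sketched as follows. This is a minimal sketch, not Kuhn's implementation: it assumes the K-space is obtained by principal component analysis over one parameter vector per training speaker, computed here by plain power iteration. All function names and the sample vectors are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_direction(centered, n_iter=200):
    """Leading principal direction by power iteration on the scatter matrix.

    Caveat of this sketch: the uniform starting vector must not be orthogonal
    to the leading direction, and the data must not be all zeros.
    """
    dim = len(centered[0])
    w = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_iter):
        nxt = [0.0] * dim
        for v in centered:
            p = dot(v, w)
            for i in range(dim):
                nxt[i] += p * v[i]
        norm = math.sqrt(dot(nxt, nxt))
        if norm == 0.0:
            break
        w = [x / norm for x in nxt]
    return w


def build_space(speaker_vectors, k):
    """Return the mean vector and k basis directions (found by deflation)."""
    n = len(speaker_vectors)
    mean = [sum(v[i] for v in speaker_vectors) / n
            for i in range(len(speaker_vectors[0]))]
    residual = [[x - m for x, m in zip(v, mean)] for v in speaker_vectors]
    basis = []
    for _ in range(k):
        w = top_direction(residual)
        basis.append(w)
        # Remove the component along w before finding the next direction.
        residual = [[x - dot(v, w) * wi for x, wi in zip(v, w)]
                    for v in residual]
    return mean, basis


def project(vector, mean, basis):
    """Describe a speaker by k coordinates instead of the full vector."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [dot(centered, w) for w in basis]


def reconstruct(coords, mean, basis):
    out = list(mean)
    for a, w in zip(coords, basis):
        out = [o + a * wi for o, wi in zip(out, w)]
    return out


# Hypothetical 3-dimensional vectors for three training speakers.
speakers = [[4.0, 3.0, 3.0], [5.0, 5.0, 5.0], [6.0, 7.0, 7.0]]
mean, basis = build_space(speakers, k=1)
coords = project([6.0, 7.0, 7.0], mean, basis)
print(len(coords), reconstruct(coords, mean, basis))
```

In this sketch a new speaker is described by k coordinates rather than by every model parameter, so adaptation estimates k numbers instead of the full vector.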


Claim(s) 
2 


Acero shows: 

The method of claim 1 wherein said training speech data is separated by identifying 
context dependent data and using said context dependent data to identify said speaker 
independent data (see Abstract; §2 on pp. 342-343).
{The delta parameter δ (offset) is speaker-independent and context-dependent.}


Claim(s)
3 


Acero shows: 

The method of claim 1 wherein said training speech data is separated by identifying 
context independent data and using said context independent data to identify said speaker 
dependent data (§2 on pp. 342-343).

{The Gaussian mean vector μ is speaker-cluster-dependent and context-independent.}


Claim(s) 
4 


Acero shows: 

The method of claim 1 wherein said maximum likelihood re-estimation step is 
performed iteratively. (p.343, §2.2) 


Claim(s) 
5 


Acero shows: 

The method of claim 1 wherein said linear transformations are effected as offsets (δ)
from said centroids (μ). (p. 342, col. 2, 1st ¶: "The use of correlation..."; p. 343, col. 1, eq. 2)


Claim(s)
6 


Acero shows: 

The method of claim 1 wherein said maximum likelihood re-estimation step generates 
a re-estimated acoustic space, a re-estimated centroids and re-estimated offsets and wherein 
said context dependent acoustic models are constructed using said re-estimated
low-dimensional space and said re-estimated offsets (see Abstract; p. 343, §2.2).

Acero does not show: 

The acoustic space is a low-dimensional space.

Kuhn teaches:

developing a low-dimensional space (K-space) from training speech data (see
Abstract).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to modify the acoustic model of Acero to include the low-dimensional
space teaching of Kuhn in order to provide a fast speaker adaptation technique. Working in a
low-dimensional space reduces the computation needed to produce a result
representative of the original, higher-dimensional space and thus leads to a faster speaker
adaptation technique.


Claim(s) 
7 


Acero shows: 

The method of claim 1 wherein said linear transformations of said centroids (mean
vector μ) are represented in tree data structures (tree hierarchy: p. 343, Fig. 1) corresponding
to individual sound units (e.g., phonetics: p. 342, col. 2, §2, part 1).


Claim(s) 
8 


Acero shows: 

The method of claim 5 wherein said offsets (delta δ) are represented in tree data
structures (tree hierarchy: p. 343, Fig. 1) corresponding to individual sound units (e.g.,
phonetics: p. 342, col. 2, §2, part 1).


Claim(s)
11 



Acero shows: 



A method for developing context dependent acoustic models, comprising the steps of: 



representing the training speech data from each of said plurality of training speakers 
(p. 342, col. 1, §1, 1st ¶: "The error rate...") as the combination of a speaker dependent
component (e.g., speaker dependent Gaussian mean vector μ: p. 343, col. 1, 1st ¶, eq. 2) and
a speaker independent component (e.g., speaker independent delta δ: p. 343, col. 1, 1st ¶,
eq. 2);



representing said speaker dependent component as centroids (e.g., Gaussian mean
vector μ, p. 343, col. 1, 1st ¶, eq. 2);



representing said speaker independent component as linear transformations (delta δ:
p. 343, col. 1, 1st ¶, eq. 2) of said centroids; and

{The transformation is linear, i.e., a linear combination (p. 342, col. 2, 1st ¶: "The use of
correlation..."; p. 343, eq. 2).}

performing maximum likelihood re-estimation (e.g., iterative estimation: p. 343, §2.2)
on said training speech data of at least one of said low-dimensional space, said centroids
(mean vector μ), and said linear transformations (delta δ: p. 343, col. 1, 1st ¶, eq. 2) to
represent context dependent acoustic model, wherein said linear transformations are effected
as offsets (delta δ) from said centroids (mean vector μ), wherein said maximum likelihood
re-estimation step generates a re-estimated acoustic space, a re-estimated centroids and
re-estimated offsets and wherein said context dependent acoustic models are constructed using
said re-estimated low-dimensional space and said re-estimated offsets (see Abstract; p. 343,
§2.2).



Acero does not show: 



developing a low-dimensional space from training speech data obtained from a 
plurality of training speakers. 



Kuhn teaches: 

developing a low-dimensional space (K-space) from training speech data obtained
from a plurality of training speakers (see Abstract).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to modify the acoustic model of Acero to include the low-dimensional
space teaching of Kuhn in order to provide a fast speaker adaptation technique. Working in a
low-dimensional space reduces the computation needed to produce a result
representative of the original, higher-dimensional space and thus leads to a faster speaker
adaptation technique.


Claim(s) 
12 


Acero shows: 

The method of claim 11 wherein said linear transformations of said centroids (mean
vector μ) are represented in tree data structures (tree hierarchy: p. 343, Fig. 1) corresponding
to individual sound units (e.g., phonetics: p. 342, col. 2, §2, part 1).


Claim(s) 
13 


Acero shows: 

The method of claim 11 wherein said offsets (delta δ) are represented in tree data
structures (tree hierarchy: p. 343, Fig. 1) corresponding to individual sound units (e.g.,
phonetics: p. 342, col. 2, §2, part 1).


Claim(s) 
16 


Acero shows: 

The method of claim 11 wherein said training speech data is separated by identifying
context dependent data and using said context dependent data to identify said speaker
independent data (see Abstract; §2 on pp. 342-343).
{The delta parameter δ (offset) is speaker-independent and context-dependent.}


Claim(s) 
17 


Acero shows: 

The method of claim 11 wherein said training speech data is separated by identifying
context independent data and using said context independent data to identify said speaker
dependent data (§2 on pp. 342-343).

{The Gaussian mean vector μ is speaker-cluster-dependent and context-independent.}


Claim(s)
18 


Acero shows: 

The method of claim 11 wherein said maximum likelihood re-estimation step is
performed iteratively. (p. 343, §2.2)



7. Claims 9-10 and 14-15 are rejected under 35 U.S.C. 103(a) as being unpatentable over Acero in 
view of Kuhn, and further in view of Kuhn et al. (U.S. Patent 6,141,644, hereinafter "Kuhn[2]"). 


Claim(s) 
9 


The modified Acero does not show: 
The method of claim 1 further comprising: 

using said speaker dependent component to perform speaker verification.

Kuhn[2] teaches:

using speaker dependent component to perform speaker verification (col. 6, ll. 47-58).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to include the speaker verification method of Kuhn[2] in the acoustic
modeling of the modified Acero in order to provide an improved method of authentication of
users in applications such as conducting financial transactions over the telephone (Kuhn[2], col. 1,
ll. 10-25).


Claim(s) 
10 


The modified Acero does not show: 
The method of claim 1 further comprising: 

using said speaker dependent component to perform speaker identification. 

Kuhn[2] teaches:

using speaker dependent component to perform speaker identification (col. 6, ll. 47-58).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to include the speaker identification method of Kuhn[2] in the acoustic
modeling of the modified Acero in order to provide an improved method of authentication of
users in applications such as conducting financial transactions over the telephone (Kuhn[2], col. 1,
ll. 10-25).


Claim(s) 
14 


The modified Acero does not show: 

The method of claim 11 further comprising:

using said speaker dependent component to perform speaker verification.

Kuhn[2] teaches:

using speaker dependent component to perform speaker verification (col. 6, ll. 47-58).

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to include the speaker verification method of Kuhn[2] in the acoustic
modeling of the modified Acero in order to provide an improved method of authentication of
users in applications such as conducting financial transactions over the telephone (Kuhn[2], col. 1,
ll. 10-25).


Claim(s)
15 


The modified Acero does not show: 

The method of claim 11 further comprising: 

using said speaker dependent component to perform speaker identification. 

Kuhn[2] teaches:

using speaker dependent component to perform speaker identification (col. 6, ll. 47-58).

It would have been obvious to a person of ordinary skill in the art at the time the invention was
made to include the speaker identification method of Kuhn[2] in the acoustic modeling of the
modified Acero in order to provide an improved method of authentication of users in
applications such as conducting financial transactions over the telephone (Kuhn[2], col. 1,
ll. 10-25).



Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Tim Lao whose telephone number is 703-305-8955. 
The examiner can normally be reached on M-F, 8:30am-5pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Doris To, can be reached on 703-305-4827. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



Tim Lao 
Examiner 
Art Unit 2655 